Amplicon Analysis at Microsynth
Objective: Amplicon analysis serves various purposes; outside of resequencing, typical applications are metagenomics and other inventory-type analyses.
Scope: Microsynth supports this work with sequencing and bioinformatics analysis.
The bioinformatics part of the analysis is described below.
Principal Workflow
- Read quality filtering
- Read denoising/clustering/dereplication
- Downstream analysis
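To illustrate the dereplication step named above, here is a minimal sketch that collapses identical reads and counts them. It is a toy example and does not describe Microsynth's actual pipeline; the function name `dereplicate` and the example reads are assumptions made for illustration.

```python
from collections import Counter


def dereplicate(reads):
    """Collapse identical sequences and count them (hypothetical helper).

    Sequences are compared case-insensitively. Denoising and clustering,
    which also merge near-identical reads, are not covered here.
    """
    counts = Counter(seq.upper() for seq in reads)
    # Sort by decreasing abundance, then alphabetically for a stable order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Invented example reads, not real data.
example_reads = ["ACGT", "acgt", "ACGA", "ACGT", "TTGA"]
unique_reads = dereplicate(example_reads)
print(unique_reads)
```

Each unique sequence is reported once together with the number of reads it represents.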
Principal Dataflow
- Input: demultiplexed and quality-filtered reads in fastq format for each sample
- Enrichment: read denoising/clustering/dereplication and aggregation with supplementary data (e.g., taxonomies)
- Output: distribution of clusters in samples and associated statistics
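The output described above, the distribution of clusters in samples, can be pictured as a cluster-by-sample count table. The sketch below builds such a table from per-read cluster assignments and derives relative abundances. The function names, the input layout and the example values are assumptions for illustration only, not the format of the delivered files.

```python
from collections import Counter


def cluster_table(assignments):
    """Build a cluster-by-sample count table (hypothetical helper).

    assignments maps a sample name to a list of cluster ids, one per read.
    """
    per_sample = {sample: Counter(ids) for sample, ids in assignments.items()}
    clusters = sorted({c for counts in per_sample.values() for c in counts})
    # A Counter returns 0 for clusters that are absent from a sample.
    return {c: {s: per_sample[s][c] for s in per_sample} for c in clusters}


def relative_abundance(table):
    """Divide each count by the total read count of its sample."""
    totals = {}
    for row in table.values():
        for sample, n in row.items():
            totals[sample] = totals.get(sample, 0) + n
    return {
        c: {s: (n / totals[s] if totals[s] else 0.0) for s, n in row.items()}
        for c, row in table.items()
    }


# Invented example assignments, not real data.
assignments = {
    "sampleA": ["c1", "c1", "c2"],
    "sampleB": ["c2", "c3", "c3", "c3"],
}
table = cluster_table(assignments)
rel = relative_abundance(table)
for cluster, row in table.items():
    print(cluster, row)
```

Relative abundances make samples with different sequencing depths comparable, which is why they are computed per sample rather than over the whole table.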
Data Formats
Specialized data formats are listed below (html, pdf, xlsx and txt are excluded).
- gz is a data compression format (cf. "zip"); files can be decompressed with 7-Zip, for instance
- fasta/fastq formats store sequence information and can be inspected using text editors (e.g., Notepad++)
- tab/tsv format separates data columns with tab characters (open with a text editor or Excel)
- biom format represents biological sample by observation contingency tables (open with a text editor or specialized software)
- RData is a binary representation of R objects (use R to load and handle)
- newick is a plain text format used here to describe phylogenetic trees of sequences (open with a text editor or specialized software)
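Besides the tools named above, gz-compressed fastq files and tab-separated tables can also be read programmatically. The following sketch uses only the Python standard library and assumes four-line fastq records; the helper names `open_text`, `read_fastq` and `read_tsv` are hypothetical and not part of any delivered software.

```python
import csv
import gzip


def open_text(path):
    """Open a plain or gzip-compressed text file for reading."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt", encoding="utf-8", newline="")
    return open(path, "r", encoding="utf-8", newline="")


def read_fastq(path):
    """Yield (identifier, sequence, quality) tuples.

    Assumes four lines per record: '@' header, sequence,
    '+' separator and quality string.
    """
    with open_text(path) as handle:
        while True:
            header = handle.readline().strip()
            if not header:
                break  # end of file (or a blank line)
            sequence = handle.readline().strip()
            handle.readline()  # skip the '+' separator line
            quality = handle.readline().strip()
            yield header[1:], sequence, quality


def read_tsv(path):
    """Return all rows of a tab-separated file as lists of strings."""
    with open_text(path) as handle:
        return list(csv.reader(handle, delimiter="\t"))
```

For example, `for name, seq, qual in read_fastq("sample.fastq.gz"): ...` iterates over the reads of one sample, where the file name is a placeholder.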
Anatomy of Overview Page
Input Quality Assessment
This section details key steps in quality control.
- FastQC html files for sequencing quality assessment
Amplicon Analysis
- Read cluster summaries and statistics
- Downstream analysis (differential analysis; if ordered)
- Functional profiling (if ordered)
Entire Analysis
- Folder containing all files generated by the analysis